Minimising semantic drift with Mutual Exclusion Bootstrapping
Abstract
Iterative bootstrapping techniques are commonly used to extract lexical semantic resources from raw text. Their major weakness is that, without costly human intervention, the extracted terms drift (often rapidly) from the meaning of the original seed terms. In this paper we propose Mutual Exclusion Bootstrapping (MEB), in which multiple semantic classes compete for each extracted term. This significantly reduces the problem of semantic drift by providing boundaries for the semantic classes. We demonstrate the superiority of MEB to standard bootstrapping in extracting named entities from the Google Web 1T 5-grams. Finally, we demonstrate that MEB is a multi-way cut problem over semantic classes, terms and contexts.
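The competition between classes described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`keep_exclusive`, `meb_sketch`), the toy (context, term) pairs, the seed sets, and the rule that any context or term proposed by more than one class is simply discarded are all assumptions made for illustration. The real system operates on 5-gram data and may rank or filter candidates differently.

```python
from collections import defaultdict


def keep_exclusive(proposals):
    """Keep only the items proposed by exactly one class (assumed exclusion rule)."""
    counts = defaultdict(int)
    for items in proposals.values():
        for item in items:
            counts[item] += 1
    return {
        label: {item for item in items if counts[item] == 1}
        for label, items in proposals.items()
    }


def meb_sketch(pairs, seeds, iterations=2):
    """Grow one lexicon per class from (context, term) pairs, in parallel."""
    contexts_of = defaultdict(set)  # term -> contexts it occurs in
    terms_of = defaultdict(set)     # context -> terms it occurs with
    for context, term in pairs:
        contexts_of[term].add(context)
        terms_of[context].add(term)

    lexicons = {label: set(terms) for label, terms in seeds.items()}
    for _ in range(iterations):
        # Each class proposes the contexts its current terms occur in.
        proposed_contexts = {
            label: {c for t in terms for c in contexts_of[t]}
            for label, terms in lexicons.items()
        }
        # Contexts claimed by more than one class are dropped.
        contexts = keep_exclusive(proposed_contexts)

        # Each class proposes unassigned terms seen in its exclusive contexts.
        assigned = set().union(*lexicons.values())
        proposed_terms = {
            label: {t for c in ctxs for t in terms_of[c]} - assigned
            for label, ctxs in contexts.items()
        }
        # Terms claimed by more than one class are dropped.
        for label, terms in keep_exclusive(proposed_terms).items():
            lexicons[label] |= terms
    return lexicons


# Hypothetical toy data: "visited" is shared by both classes, and
# "Washington" appears in contexts belonging to both classes.
PAIRS = [
    ("mayor of", "Sydney"), ("mayor of", "London"), ("mayor of", "Washington"),
    ("said", "Obama"), ("said", "Clinton"), ("said", "Washington"),
    ("visited", "Sydney"), ("visited", "Obama"),
]
SEEDS = {"LOC": {"Sydney"}, "PER": {"Obama"}}

lexicons = meb_sketch(PAIRS, SEEDS)
```

In this toy run the ambiguous context "visited" and the ambiguous term "Washington" are claimed by both classes and therefore excluded, so neither lexicon absorbs them. That is the boundary effect the abstract credits with reducing drift.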
Similar resources
Weighted Mutual Exclusion Bootstrapping for Domain Independent Lexicon and Template Acquisition
We present the Weighted Mutual Exclusion Bootstrapping (WMEB) algorithm for simultaneously extracting precise semantic lexicons and templates for multiple categories. WMEB is capable of extracting larger lexicons with higher precision than previous techniques, successfully reducing semantic drift by incorporating new weighting functions and a cumulative template pool while still enforcing mutua...
Experiments in Mutual Exclusion Bootstrapping
Mutual Exclusion Bootstrapping (MEB) was designed to overcome the problem of semantic drift suffered by iterative bootstrapping, where the meaning of extracted terms quickly drifts from the original seed terms (Curran et al., 2007). MEB works by extracting mutually exclusive classes in parallel which constrain each other. In this paper we explore the strengths and limitations of MEB by applying...
Graph-based Analysis of Semantic Drift in Espresso-like Bootstrapping Algorithms
Bootstrapping has a tendency, called semantic drift, to select instances unrelated to the seed instances as the iteration proceeds. We demonstrate the semantic drift of bootstrapping has the same root as the topic drift of Kleinberg’s HITS, using a simplified graph-based reformulation of bootstrapping. We confirm that two graph-based algorithms, the von Neumann kernels and the regularized Laplac...
Relation Guided Bootstrapping of Semantic Lexicons
State-of-the-art bootstrapping systems rely on expert-crafted semantic constraints such as negative categories to reduce semantic drift. Unfortunately, their use introduces a substantial amount of supervised knowledge. We present the Relation Guided Bootstrapping (RGB) algorithm, which simultaneously extracts lexicons and open relationships to guide lexicon growth and reduce semantic drift. Thi...
Learning Semantic Lexicons using Graph Mutual Reinforcement based Bootstrapping
Bootstrapping has received a great deal of attention in many fields and has achieved good results. Semantic lexicons have also proved useful for many natural language processing tasks. This paper presents an approach to learning semantic lexicons using a new bootstrapping method based on Graph Mutual Reinforcement. The approach uses only unlabeled data and a few seed wor...
Publication date: 2007